Exploiting statistically significant dependent rules for associative classification

Authors

Jundong Li

Osmar R. Zaïane

Abstract

Established associative classification algorithms have been shown to be very effective in handling categorical data such as text data. The learned model is a set of rules that are easy to understand and can be edited. However, these algorithms still suffer from the following limitations: first, they mostly use the support-confidence framework to mine classification association rules, which requires setting some confounding parameters; second, the lack of statistical dependency in that framework may lead to the omission of many interesting rules and the detection of meaningless rules; third, the rule generation process usually produces a very large number of rules, which calls into question the interpretability and readability of the learned associative classification model. In this paper, we propose a novel associative classifier, SigDirect, to address the above problems. In particular, we use Fisher’s exact test as a significance measure to directly mine classification association rules with the help of effective pruning strategies. Without any threshold settings like minimum support and minimum confidence, SigDirect is able to find nonredundant classification association rules which express a statistically significant dependency between a set of antecedent items and a consequent class label. To further reduce the number of noisy rules, we present an instance-centric rule pruning strategy to find a subset of high-quality rules. Finally, we propose and investigate various rule classification strategies to achieve a more accurate classification model. Experimental results ...

The work was done when the author was at the University of Alberta.
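To illustrate how Fisher's exact test can score a classification association rule, the following is a minimal sketch, not the SigDirect implementation. It computes a generic one-sided p-value for a rule X -> c from four counts. The function names, the argument names, and the significance level of 0.05 are assumptions made for this example; the abstract does not specify them.

```python
from math import comb


def fisher_p_value(n_total, n_x, n_c, n_xc):
    """One-sided Fisher exact p-value for a rule X -> c (hypothetical helper).

    n_total: number of training instances
    n_x:     instances containing the antecedent itemset X
    n_c:     instances labelled with class c
    n_xc:    instances containing X and labelled with c

    Returns the probability of seeing at least n_xc co-occurrences
    if X and c were independent (hypergeometric upper tail).
    """
    denom = comb(n_total, n_c)
    max_k = min(n_x, n_c)
    # Sum the hypergeometric probabilities of all tables that are at
    # least as extreme as the observed one. math.comb returns 0 when
    # k > n, so impossible tables contribute nothing.
    tail = sum(
        comb(n_x, k) * comb(n_total - n_x, n_c - k)
        for k in range(n_xc, max_k + 1)
    )
    return tail / denom


def is_significant(p_value, alpha=0.05):
    """Assumed decision rule: keep the rule if p <= alpha."""
    return p_value <= alpha


if __name__ == "__main__":
    # Toy data: 10 instances, X occurs in 5, class c in 5, both in 5.
    p = fisher_p_value(10, 5, 5, 5)
    print(p, is_significant(p))
```

A smaller p-value indicates a stronger positive dependency between the antecedent and the class label, which is why such a measure can stand in for minimum support and minimum confidence thresholds.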

Similar resources

Exploiting Associations between Class Labels in Multi-label Classification

Multi-label classification has many applications in text categorization, biology and medical diagnosis, in which multiple class labels can be assigned to each training instance simultaneously. As it is often the case that there are relationships between the labels, extracting the existing relationships between the labels and taking advantage of them during the training or prediction phases ...

School of IT Technical Report USING SIGNIFICANT, POSITIVELY ASSOCIATED AND RELATIVELY CLASS CORRELATED RULES FOR ASSOCIATIVE CLASSIFICATION OF IMBALANCED DATASETS

The application of association rule mining to classification has led to a new family of classifiers which are often referred to as “Associative Classifiers (ACs)”. The advantage of ACs is that they are rule-based and thus lend themselves to an easier interpretation. Another advantage that ACs enjoy is that they are based on a global search criterion, unlike other rule-based classifiers – e.g. d...

Review and Comparison of Associative Classification Data Mining Approaches

Associative classification (AC) is a data mining approach that combines association rule and classification to build classification models (classifiers). AC has attracted significant attention from several researchers, mainly because it derives accurate classifiers that contain simple yet effective rules. In the last decade, a number of associative classification algorithms have been proposed ...

Associative Classification Based on Artificial Immune System

Associative classification algorithms, which are based on association rules, have performed well compared with other classification approaches. However, a fundamental limitation of these classification algorithms is that the search space of candidate rules is very large and the processes of rule discovery and rule selection are conducted separately. This paper proposes an approach called ARMBIS,...

Journal:

Intell. Data Anal.

Volume 21

Publication date: 2017
